Quantum Machine Learning for Distributed Quantum Protocols with Local Operations and Noisy Classical Communications

Distributed quantum information processing protocols such as quantum entanglement distillation and quantum state discrimination rely on local operations and classical communications (LOCC). Existing LOCC-based protocols typically assume the availability of ideal, noiseless, communication channels. In this paper, we study the case in which classical communication takes place over noisy channels, and we propose to address the design of LOCC protocols in this setting via the use of quantum machine learning tools. We specifically focus on the important tasks of quantum entanglement distillation and quantum state discrimination, and implement local processing through parameterized quantum circuits (PQCs) that are optimized to maximize the average fidelity and average success probability in the respective tasks, while accounting for communication errors. The introduced approach, Noise Aware-LOCCNet (NA-LOCCNet), is shown to have significant advantages over existing protocols designed for noiseless communications.


Motivation
Distributed quantum computing is considered to be an important application for the quantum Internet, offering a path forward towards scalable quantum computers [1]. A practically and theoretically relevant class of distributed quantum computing protocols relies on local quantum operations and classical communications (LOCC) [2,3]. In LOCC-based protocols, distributed nodes carry out local quantum processing steps that are interwoven with the exchange of classical information, i.e., bits. LOCC-based protocols have been designed for a variety of tasks, including entanglement distillation, state discrimination and channel simulation [4][5][6][7].
Recently, a quantum machine learning (QML) framework was introduced in [8] for the design of LOCC protocols. The approach, termed LOCCNet, is motivated by the difficulty of designing optimal LOCC protocols under the restrictions imposed by noisy intermediate scale quantum (NISQ) computers. Following the QML framework [9,10], LOCCNet prescribes the use of parameterized quantum circuits (PQCs) for local processing. PQCs have been widely investigated in recent years as a means to program NISQ computers via classical optimization, with applications ranging from combinatorial optimization to generative modelling [9]. A PQC typically consists of a sequence of one- and two-qubit rotations, whose parameters can be optimized, as well as of fixed entangling gates.
Existing LOCC-based protocols, including the QML-based schemes introduced in [8], assume the availability of ideal, noiseless communication channels. In contrast, in this paper, we study the case in which classical communication takes place over noisy channels. We introduce an approach, referred to as Noise Aware-LOCCNet (NA-LOCCNet), that addresses the design of LOCC protocols in the presence of noisy classical channels via the use of QML tools. We specifically focus on the important tasks of quantum entanglement distillation and quantum state discrimination.
Traditionally, entanglement distillation protocols have been designed by hand, targeting specific mixed states as the input of the protocol [4,12,17]. Specific examples include the DEJMPS protocol, which targets the so-called S-state [17]. These methods rely on local operations via specific unitaries; on the measurement of one qubit at Alice and one at Bob; and on the classical communication of the measurement outputs over a noiseless channel. Based on the measurement outputs, Alice and Bob decide whether to keep the unmeasured pair of qubits or to declare a distillation failure.
Recently, as illustrated in Figure 1, the LOCCNet framework introduced in [8] for the design of LOCC protocols prescribes the use of PQCs for the local unitaries applied by Alice and Bob. LOCCNet assumes ideal classical communications, while this paper studies the case in which communication between the parties holding imperfectly entangled qubits takes place over a noisy channel. To address this more challenging scenario, the proposed NA-LOCCNet method adapts QML tools to program local operations via PQCs while accounting for the channel noise.
Traditionally, LOCC-based quantum state discrimination protocols have been designed by hand by focusing on the discrimination of specific pairs of states. Specific examples include the discrimination of orthogonal pure states [5] and the discrimination of maximally entangled states [22]. Assuming the presence of two nodes, Alice and Bob, these methods select the unitary at Bob as a function of the output of measurements made by Alice and shared with Bob over a noiseless communication link.
Reference [8] also introduced the LOCCNet framework for quantum state discrimination. The design of LOCCNet in [8] considers the problem of distinguishing two orthogonal maximally entangled Bell states, where one of the Bell states is corrupted by an entanglement-breaking quantum channel. The design assumes ideal, noiseless classical communications, and it operates on a single pair of qubits. As a second contribution, this paper introduces the NA-LOCCNet framework for quantum state discrimination by accounting for noisy classical communications in the design problem.

Main Contributions
As summarized in the previous subsections, the design of LOCCNet in [8] assumes ideal, noiseless classical communications. In contrast, in this paper, we study the case in which communication takes place over noisy binary symmetric channels. The specific contributions are as follows.

• As observed in Figure 1, we first introduce NA-LOCCNet as a novel PQC-based architecture for distributed entanglement distillation (see Figure 4) that is designed with the goal of maximizing the average fidelity while accounting for the randomness caused by communication errors.
• Then, we adapt the NA-LOCCNet framework to the problem of distributed quantum state discrimination (see Figure 9), with the goal of maximizing the average probability of successful detection.
• The introduced NA-LOCCNet is shown via experiments to have significant advantages over existing protocols designed for noiseless communications. Furthermore, in quantum state discrimination, we make the important observation that, depending on the level of classical noise, a larger level of entanglement-breaking noise can be advantageous to facilitate successful distributed discrimination.
Part of this paper was presented in [23], which covered only NA-LOCCNet for entanglement distillation.

Organization
The rest of the paper is organized as follows. In Section 2, we present the NA-LOCCNet protocol for the distributed entanglement distillation, while Section 3 focuses on the NA-LOCCNet protocol for the distributed quantum state discrimination. In both sections, we first define the problem statement, review the relevant state of the art, present the proposed NA-LOCCNet protocol, and finally give experimental results. Section 4 concludes the paper.

Notations and Definitions
For any non-negative integer $K$, $[K]$ represents the set $\{0, 1, \ldots, K\}$. Given a discrete set $A$ and a positive integer $S$, $A^S$ represents the set of strings of length $S$ from the alphabet $A$. The Kronecker product is denoted as $\otimes$; $I_d$ represents the $d \times d$ identity matrix; $M^\dagger$ represents the complex conjugate transpose of the matrix $M$; $\mathrm{tr}(M)$ represents the trace of the matrix $M$; and a positive semidefinite matrix $M$ is denoted as $M \succeq 0$. We adopt standard notations for quantum states, computational basis and quantum gates [13]. Let

Learning Entanglement Distillation with Noisy Classical Communication
In this section, we first formulate the distributed entanglement distillation problem and review the relevant state-of-the-art protocols. We then propose NA-LOCCNet for the distributed entanglement distillation and give experimental results.

Problem Formulation
In this subsection, we formulate the problem of the distributed entanglement distillation in the presence of a noisy classical communication channel, and we describe the performance metrics of interest.

Setting
As illustrated in Figure 1, we consider a system consisting of two main parties, Alice and Bob, aided by a third party, Charlie. Alice and Bob have local quantum processing capability, while Charlie is not equipped with quantum computing devices. Alice and Bob can communicate to Charlie over a noisy classical channel. An imperfect quantum entanglement mechanism generates pairs of noisy entangled qubits, also referred to as noisy ebits. One of the qubits of each entangled pair is made available to Alice and the other to Bob. The goal of the system is to improve the average fidelity, defined in Sections 2.1.2 and 2.3.1, of the noisy ebits shared by Alice and Bob through local operations (LO) at Alice and Bob, as well as through classical communication (CC) to Charlie.
The quantum entanglement generator produces $k$ pairs of noisy ebits. The state of each qubit pair is described by a $4 \times 4$ density matrix $\rho_{AB}$. Throughout the paper, we use the subscript $A$ to denote the qubits available at Alice, while the subscript $B$ is used for the qubits at Bob. As in [8], we specifically focus on the noisy, i.e., mixed, ebit state described by the density matrix
$$\rho_{AB} = F\,|\phi^+\rangle\langle\phi^+| + (1-F)\,|00\rangle\langle 00|, \tag{1}$$
where $F \in [0, 1]$ represents the input fidelity and
$$|\phi^+\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right)$$
is a maximally entangled Bell state. The noisy ebit state in (1) is also known as the S-state [8], and it describes a situation in which the two qubits are in the maximally entangled state $|\phi^+\rangle$ with probability $F$, and in the separable, i.e., non-entangled, state $|00\rangle$ with probability $1 - F$. This type of noisy state arises in protocols for entanglement generation that use single-photon detection in the presence of photon loss [16,24,25]. Furthermore, the S-state is known to be more challenging to "denoise" than other mixed states in which the separable state, occurring with probability $1 - F$, is orthogonal to $|\phi^+\rangle$ [24].
As in [8], we focus on the standard case in which $k = 2$ identical pairs of S-states, $\rho_{A_0 B_0}$ and $\rho_{A_1 B_1}$, are generated. The goal is to distill the two noisy ebit pairs to obtain a single pair of less noisy ebits. Following the standard terminology [12], the qubits $A_0$ and $B_0$ are referred to as the preserved pair, and the qubits $A_1$ and $B_1$ as the sacrificial pair.
As shown in Figure 1, Alice and Bob process their respective qubits ($A_1$ and $A_0$ for Alice, and $B_1$ and $B_0$ for Bob) via local quantum operations defined by the unitaries $U_A(\theta)$ and $U_B(\theta)$, respectively. As detailed in the next sections, the operation of the unitaries generally depends on a vector $\theta$ of classical parameters. Then, the qubits $A_1$ and $B_1$ are measured in the computational basis at Alice and Bob, respectively, and the measurement outcomes (0 or 1) are communicated to Charlie using noisy classical channels.
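As a small numerical illustration of the input model, the following sketch builds the S-state and the joint state of the two pairs with NumPy. The function name `s_state` and the variable names are assumptions made here for illustration; they do not come from the paper or from [8].

```python
import numpy as np

def s_state(F: float) -> np.ndarray:
    """Return F*|phi+><phi+| + (1-F)*|00><00| as a 4x4 density matrix."""
    phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)
    zero_zero = np.array([1.0, 0.0, 0.0, 0.0])              # |00>
    return F * np.outer(phi_plus, phi_plus) + (1 - F) * np.outer(zero_zero, zero_zero)

# One noisy ebit with the value F = 0.6 used in the experiments.
rho = s_state(0.6)

# Two identical pairs; the Kronecker product orders the qubits as (A0, B0, A1, B1).
two_pairs = np.kron(rho, rho)
```

The Kronecker product of the two pair states directly yields the qubit ordering $(A_0, B_0, A_1, B_1)$, so no permutation is needed before applying operators written in that ordering.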
We specifically assume that communication to Charlie occurs over independent binary symmetric channels with bit flip probability p.
If Charlie receives message 0 from both Alice and Bob, it declares that the distillation is successful, and Alice and Bob retain the pair of qubits $A_0$ and $B_0$. Instead, if Charlie receives the pairs of messages (0, 1), (1, 0) or (1, 1) from Alice and Bob, it declares a failure. In this case, Alice and Bob discard the qubits $A_0$ and $B_0$.
We remark that most conventional entanglement distillation protocols [4,17] use decision rules in which either pair of messages (0, 0) or (1, 1) is considered as success. Here, we follow the approach in [8] of treating (0, 0) as the only case in which Charlie declares success. This design choice facilitates the optimization of the unitaries $U_A(\theta)$ and $U_B(\theta)$ through the vector $\theta$.
One of the goals of this work is to design the unitaries $U_A(\theta)$ and $U_B(\theta)$ at Alice and Bob such that the output state of qubits $A_0$ and $B_0$, upon successful distillation, is as close as possible in terms of fidelity to the ideal ebit state $|\phi^+\rangle$.

Performance Metrics
The performance of entanglement distillation is measured in this paper, as in [8,26], in terms of fidelity and probability of success. The fidelity of a state $\rho_{AB}$ with respect to the ebit state $|\phi^+\rangle$ is defined as
$$F(\rho_{AB}) = \langle\phi^+|\,\rho_{AB}\,|\phi^+\rangle,$$
while the probability of success is the probability of receiving the pair of messages (0, 0) at Charlie. Let $U(\theta)$ be the $16 \times 16$ unitary operation corresponding to the separate application of the $4 \times 4$ local unitaries $U_A(\theta)$ and $U_B(\theta)$ to the qubit pairs $(A_0, A_1)$ and $(B_0, B_1)$, respectively. We order the qubits as $(A_0, B_0, A_1, B_1)$ to facilitate the derivations below. The state of the four qubits after the local operations can be expressed as the density matrix
$$\rho(\theta) = U(\theta)\left(\rho_{A_0 B_0} \otimes \rho_{A_1 B_1}\right)U(\theta)^\dagger,$$
where we have made explicit the dependence on the model parameter vector $\theta$.

Existing Distillation Protocols
In this section, we review current state-of-the-art distillation protocols. We focus on the DEJMPS protocol [17] and on the LOCCNet protocol [8] as applied to k = 2 copies of the S-state (1). We emphasize that all the existing distillation protocols are designed for noiseless classical communication channels to Charlie, i.e., assuming p = 0.

DEJMPS Protocol
In the DEJMPS protocol, the local unitaries $U_A(\theta)$ and $U_B(\theta)$ applied by Alice and Bob do not have free parameters, and are hence denoted as $U_A$ and $U_B$, dropping the dependence on the model parameter vector $\theta$. Specifically, the unitary $U_A$ at Alice is given by the Pauli X-rotation $R_X(\pi/2)$ applied on both qubits, followed by a controlled NOT (CNOT) gate with the qubit $A_0$ as the control and the qubit $A_1$ as the target. Similarly, the unitary $U_B$ at Bob is defined by the cascade of Pauli X-rotations $R_X(-\pi/2)$ on the two qubits and of a CNOT gate with the qubit $B_0$ as the control and the qubit $B_1$ as the target. If Charlie receives messages (0, 0) or (1, 1) from Alice and Bob, it declares that the distillation is successful, and the qubit pair $(A_0, B_0)$ is retained.
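The DEJMPS steps described above can be simulated directly on density matrices. The sketch below is an illustration only, not the authors' code: the helper names (`embed`, `dejmps`, `s_state`) are hypothetical, the rotation convention $R_X(\alpha) = \exp(-i\alpha X/2)$ is assumed, and the binary symmetric channels to Charlie with flip probability `p` are modeled as in Section 2.1.1.

```python
import numpy as np

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)  # first listed qubit is the control
PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)

def rx(angle):
    """Pauli X-rotation exp(-i * angle / 2 * X) (assumed convention)."""
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * PAULI_X

def embed(gate, targets, n=4):
    """Lift a gate acting on the listed qubits to the full n-qubit space."""
    k = len(targets)
    full = np.zeros((2**n, 2**n), dtype=complex)
    for col in range(2**n):
        bits = [(col >> (n - 1 - q)) & 1 for q in range(n)]
        sub_in = sum(bits[t] << (k - 1 - j) for j, t in enumerate(targets))
        for sub_out in range(2**k):
            out = list(bits)
            for j, t in enumerate(targets):
                out[t] = (sub_out >> (k - 1 - j)) & 1
            row = sum(b << (n - 1 - q) for q, b in enumerate(out))
            full[row, col] += gate[sub_out, sub_in]
    return full

def s_state(F):
    """S-state F*|phi+><phi+| + (1-F)*|00><00|."""
    phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
    e00 = np.array([1.0, 0.0, 0.0, 0.0])
    return F * np.outer(phi, phi) + (1 - F) * np.outer(e00, e00)

A0, B0, A1, B1 = 0, 1, 2, 3  # qubit ordering (A0, B0, A1, B1)
# Rotations are applied first (rightmost factors), then the two CNOT gates.
DEJMPS_U = (embed(CNOT, [B0, B1]) @ embed(CNOT, [A0, A1])
            @ embed(rx(-np.pi / 2), [B1]) @ embed(rx(-np.pi / 2), [B0])
            @ embed(rx(np.pi / 2), [A1]) @ embed(rx(np.pi / 2), [A0]))

def dejmps(F, p=0.0):
    """Return (average fidelity given success, probability of success)."""
    rho = DEJMPS_U @ np.kron(s_state(F), s_state(F)) @ DEJMPS_U.conj().T
    rho4 = rho.reshape(4, 4, 4, 4)  # [preserved, sacrificial, preserved', sacrificial']
    kept = np.zeros((4, 4), dtype=complex)
    for a in (0, 1):
        for b in (0, 1):
            # Unnormalized state of (A0, B0) when (A1, B1) is measured as (a, b).
            unnorm = rho4[:, 2 * a + b, :, 2 * a + b]
            for ra, rb in ((0, 0), (1, 1)):  # received pairs counted as success
                flips = int(a != ra) + int(b != rb)
                kept += p**flips * (1 - p)**(2 - flips) * unnorm
    p_succ = np.trace(kept).real
    phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
    return (phi @ kept @ phi).real / p_succ, p_succ

fidelity, p_succ = dejmps(F=0.6, p=0.25)
```

As a sanity check of the sketch, perfect input ebits (`F = 1`) are left untouched by the circuit, so the retained pair has unit fidelity regardless of `p`.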

LOCCNet
In [8], a quantum machine learning (QML)-based entanglement distillation protocol, known as LOCCNet, is introduced that uses parameterized quantum circuits (PQCs) for the unitaries $U_A(\theta)$ and $U_B(\theta)$ at Alice and Bob. As illustrated in Figure 3, the PQC $U_A(\theta)$ consists of a CNOT gate followed by a Pauli Y-rotation, while the PQC $U_B(\theta)$ is given by two CNOT gates followed by a Pauli Y-rotation. The rotation angle $\theta$ of the Pauli Y-rotation is subject to optimization. If Charlie receives messages (0, 0) from Alice and Bob through noiseless channels, i.e., $p = 0$, a success is declared and the pair $(A_0, B_0)$ of qubits is retained. The model parameter $\theta$ is optimized with the goal of maximizing the fidelity $F_{00}(\theta)$ in (8).

Noise Aware-LOCCNet
In this section, we propose Noise Aware-LOCCNet (NA-LOCCNet), which distills two qubit pairs, each in the S-state (1), in the presence of noisy classical channels from Alice and Bob to Charlie, as shown in Figure 1. The key innovation as compared to LOCCNet is that we explicitly target the performance in terms of the average fidelity by accounting for the impact of channel errors. We first describe the design objective, and then introduce the assumed structure for the PQCs $U_A(\theta)$ and $U_B(\theta)$.

Design Objective
NA-LOCCNet aims at maximizing the average conditional fidelity of the retained pair $(A_0, B_0)$ in the case of success. As explained in Section 2.1.1, Charlie declares a success if it receives the pair of messages (0, 0) from Alice and Bob through the respective binary symmetric channels with bit flip probability $p$. LOCCNet assumes a noiseless channel ($p = 0$), and hence it targets the objective $F_{00}(\theta)$, that is, the fidelity conditioned on the measurement outcomes (0, 0) being produced by Alice and Bob. In contrast, NA-LOCCNet accounts for the fact that, when Charlie declares a success upon receiving the messages (0, 0), the actual measurement outcomes may be different due to channel errors.
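One way to make this objective concrete is to average the per-outcome fidelities with the posterior probability of each true outcome pair given that Charlie received (0, 0). The sketch below follows directly from Bayes' rule and the independent binary symmetric channels of Section 2.1.1; it is an assumed formulation for illustration, and the function name and the example numbers are hypothetical rather than taken from the paper.

```python
import numpy as np

def noise_aware_fidelity(probs, fids, p):
    """Average fidelity given that Charlie receives (0, 0), and its probability.

    probs[a][b]: probability that Alice and Bob measure (a, b).
    fids[a][b]:  fidelity of the preserved pair given outcomes (a, b).
    p:           bit flip probability of each channel to Charlie.
    """
    probs = np.asarray(probs, dtype=float)
    fids = np.asarray(fids, dtype=float)
    # Probability that the true pair (a, b) is received as (0, 0):
    # every measured 1 must be flipped, every measured 0 must not.
    to_00 = np.array([[(1 - p)**2, p * (1 - p)],
                      [p * (1 - p), p**2]])
    p_succ = np.sum(to_00 * probs)
    avg_fid = np.sum(to_00 * probs * fids) / p_succ
    return avg_fid, p_succ

# Illustrative (made-up) outcome statistics for a fixed circuit parameter.
example_probs = [[0.4, 0.1], [0.1, 0.4]]
example_fids = [[0.9, 0.5], [0.5, 0.7]]
avg_fid, p_succ = noise_aware_fidelity(example_probs, example_fids, p=0.25)
```

For `p = 0` the function returns the fidelity conditioned on the outcomes (0, 0), which is the noise-agnostic LOCCNet criterion, while for `p = 0.5` the received messages carry no information and the result is the unconditional average over all outcomes.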

Architecture of the PQCs
For the PQCs $U_A(\theta)$ and $U_B(\theta)$ at Alice and Bob, respectively, we adopt the architecture shown in Figure 4. Unlike the LOCCNet architecture in Figure 3, we introduce a parameterized two-qubit gate, namely the Pauli ZY-rotation [27]. This is defined by the unitary
$$R_{ZY}(\theta) = \exp\!\left(-i\,\frac{\theta}{2}\,Z \otimes Y\right), \tag{12}$$
which is parameterized by the angle $\theta$. Recently, two-qubit rotation gates [27] were demonstrated to provide performance advantages as gates in PQCs for various quantum machine learning applications. In our work, the choice of the parameterized two-qubit gate (12) was dictated by extensive experiments with alternative architectures. We tried various other ansatzes with different two-qubit and single-qubit rotation gates, changing the position of the CNOT gate before and after the rotation gates, and changing the control and target qubits of the CNOT gates. The proposed ansatz in Figure 4 gives the best performance among the ansatzes we considered. As an example, in Section 2.4, we will compare the performance obtained by the architecture in Figure 4 with the original LOCCNet system in Figure 3, when addressing problem (11). The proposed architecture has the same complexity in terms of the number of parameters as that of LOCCNet [8]. One could also consider ansatzes with more rotation angles for the single-qubit and two-qubit rotation gates at Alice and Bob, and we leave an investigation of this point to future work.
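For readers who want to experiment with the gate, the sketch below builds a Pauli ZY-rotation as a $4 \times 4$ matrix. It assumes the common convention $\exp(-i\,\tfrac{\theta}{2}\, Z \otimes Y)$, with $Z$ acting on the first qubit; the convention and the function name `r_zy` are assumptions of this illustration, and the ordering used in [27] may differ.

```python
import numpy as np

PAULI_Z = np.diag([1.0, -1.0]).astype(complex)
PAULI_Y = np.array([[0, -1j], [1j, 0]])

def r_zy(theta: float) -> np.ndarray:
    """Two-qubit rotation exp(-i * theta / 2 * Z (x) Y) (assumed convention)."""
    zy = np.kron(PAULI_Z, PAULI_Y)
    # Since (Z (x) Y)^2 = I, the exponential reduces to cos(.) I - i sin(.) Z (x) Y.
    return np.cos(theta / 2) * np.eye(4) - 1j * np.sin(theta / 2) * zy

gate = r_zy(np.pi / 3)
```

Under this convention the gate matrix is real, because $-iY$ is a real matrix, and it reduces to the identity at $\theta = 0$.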

Optimization
Addressing problem (11) using QML with PQCs characterized by a single scalar parameter $\theta$, as for the architectures in Figures 3 and 4, requires a one-dimensional search over the limited domain $[0, 2\pi)$. This can be carried out using standard optimization techniques, including grid search or gradient descent. In particular, we use the Adam gradient descent optimizer [28] with a 0.01 learning rate and 1001 iterations. Similar to the vast majority of papers on quantum machine learning (see, e.g., [8,29]), the optimization is at the level of the parameters, here $\theta$, of the quantum gates. Implementation on a quantum computer requires a compilation step that accounts for the physical realization of the specific hardware [30].
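Because the search is one-dimensional, the grid-search alternative mentioned above is straightforward to sketch. The code below is illustrative only: `grid_search` is a hypothetical helper, and the toy objective stands in for the average fidelity, which would in practice be evaluated by simulating the circuit for each angle.

```python
import numpy as np

def grid_search(objective, num_points=1000):
    """Maximize a scalar objective over a uniform grid on [0, 2*pi)."""
    thetas = np.linspace(0.0, 2 * np.pi, num_points, endpoint=False)
    values = np.array([objective(t) for t in thetas])
    best = int(np.argmax(values))
    return thetas[best], values[best]

# Toy objective with a known maximizer at theta = 1.0 (placeholder for the fidelity).
theta_star, value_star = grid_search(lambda t: np.cos(t - 1.0))
```

A grid search avoids the local optima that a gradient method may encounter, at the cost of one objective evaluation per grid point; with a single parameter this cost is modest.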

Experiments
In this section, we evaluate the performance of the proposed NA-LOCCNet protocol in the presence of noisy communication channels from Alice and Bob to Charlie. We consider the benchmark schemes DEJMPS (Section 2.2.1) and LOCCNet (Section 2.2.2). For the latter, we consider two designs: the original optimization in [8] of the fidelity $F_{00}(\theta)$ in (8), and the optimization of the conditional average fidelity $F(\theta)$ in (9) for the PQC architecture in Figure 3.
Figure 5 plots the average output fidelity, conditioned on a successful distillation, as a function of the bit flip probability $p$ of the noisy classical channels by fixing the input fidelity of the S-state (1) to $F = 0.6$, while Figure 6 plots the same quantity as a function of the input fidelity $F$ by fixing the bit flip probability to $p = 0.25$. Note that the conditional average fidelity is given by (9) for LOCCNet and NA-LOCCNet, while for DEJMPS one needs to consider both received messages (0, 0) and (1, 1) as indicating a successful distillation.
Figure 5 shows that, as the bit flip probability $p$ increases, the average fidelity of both DEJMPS and LOCCNet decreases significantly, reaching the minimum fidelity of 0.5 when the channels are maximally noisy, i.e., with $p = 0.5$. Note that this fidelity level is smaller than the input fidelity $F = 0.6$. Interestingly, the performance of the LOCCNet architecture in Figure 3 does not improve noticeably when optimized via the channel-aware criterion (11), as opposed to the noise-agnostic fidelity criterion considered in [8]. In contrast, the proposed NA-LOCCNet with the PQC architecture in Figure 4 exhibits a significantly milder decrease in fidelity as $p$ grows, yielding an average output fidelity of 0.8 for $p = 0.5$.
The advantages of NA-LOCCNet are further validated by Figure 6, which shows gains at all values of the input fidelity F. In particular, unlike the other schemes, NA-LOCCNet never yields an output fidelity lower than the input fidelity F.
It is finally noted that the proposed approach, as well as LOCCNet [8], targets the fidelity performance and not the probability of success. This point is illustrated in Figure 7, which shows the probability of success (given by (10) for LOCCNet and NA-LOCCNet, and by the sum of the probabilities of receiving the messages (0, 0) and (1, 1) at Charlie for DEJMPS) as a function of the input fidelity $F$ for $p = 0.25$. Overall, NA-LOCCNet is observed to offer a probability of success comparable to that of LOCCNet, while improving the average fidelity.

Learning Quantum State Discrimination with Noisy Classical Communication
In this section, we first formulate the distributed quantum state discrimination problem and review the relevant state-of-the-art protocols. We then propose NA-LOCCNet for distributed quantum state discrimination and give experimental results.

Setting and Performance Metrics
As in [8], we study the distributed quantum state discrimination problem illustrated in Figure 2. In it, two agents, Alice and Bob, observe pairs of entangled qubits, and are tasked with detecting the joint quantum state of the qubit pairs. To this end, Alice and Bob can carry out local operations (LOs), as well as classical communication (CC) from Alice to Bob, i.e., they can implement an LOCC protocol. Unlike [8], we assume that the CC link between Alice and Bob is noisy. Applications of this setting include quantum sensor networks, as well as diagnostic functionalities for entanglement testing in the quantum internet [11,18,21].

Setting
Alice and Bob share $S$ qubit pairs $(A_s, B_s)$ with $s \in [S - 1]$, where each qubit $A_s$ is at Alice and each qubit $B_s$ is at Bob. Each qubit pair $(A_s, B_s)$ is entangled in one of two possible ways: the joint state of each pair $(A_s, B_s)$ is either given by the density matrix $\rho_0 = |\Phi^+\rangle\langle\Phi^+|$, with the maximally entangled Bell state $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$, or by the density matrix $\rho_1 = \mathcal{N}_{\mathrm{AD}}(|\Phi^-\rangle\langle\Phi^-|)$, where $\mathcal{N}_{\mathrm{AD}}(\cdot)$ is an amplitude damping (AD) channel and $|\Phi^-\rangle = (|00\rangle - |11\rangle)/\sqrt{2}$ is a maximally entangled Bell state orthogonal to $|\Phi^+\rangle$. The AD channel applies separately to the two qubits, and is expressed as
$$\mathcal{N}_{\mathrm{AD}}(\rho) = \sum_{i=0}^{1}\sum_{j=0}^{1} E_{ij}\,\rho\,E_{ij}^\dagger,$$
where $E_{ij} = E_i \otimes E_j$ with Kraus matrices
$$E_0 = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\gamma} \end{pmatrix}, \qquad E_1 = \begin{pmatrix} 0 & \sqrt{\gamma} \\ 0 & 0 \end{pmatrix},$$
and where $0 \le \gamma \le 1$ represents the noise parameter of the AD channel. For $\gamma = 0$, the AD channel does not alter the input Bell state $|\Phi^-\rangle$, whereas for $\gamma = 1$, the AD channel breaks the entanglement of the Bell state $|\Phi^-\rangle$, converting it to the product state $|00\rangle$. From [8], it is enough to consider the AD channel on a maximally entangled state, i.e., $|\Phi^-\rangle$, to make the two states, $\rho_0$ and $\rho_1$, non-orthogonal. We note that the results in this paper apply at a qualitative level to any other entanglement-breaking channel [14].
As observed in Figure 2, Alice applies a parameterized quantum circuit (PQC) to the $S$ qubits $A_0, A_1, \ldots, A_{S-1}$ in her possession; then, she measures the $S$ qubits, and sends the $S$ classical bits obtained from the measurements to Bob. The PQC applied by Alice implements a $2^S \times 2^S$ unitary matrix $U_A(\theta_A)$ that is parameterized by the vector $\theta_A$. Given that the input state for each qubit pair is $\rho_i$, with $i \in \{0, 1\}$, the corresponding output state for the $2S$ qubits $A_0, A_1, \ldots, A_{S-1}$ and $B_0, B_1, \ldots, B_{S-1}$ can be written as
$$\rho_i^{AB}(\theta_A) = \left(U_A(\theta_A) \otimes I_B\right)\rho_i^{\otimes S}\left(U_A(\theta_A) \otimes I_B\right)^\dagger,$$
where $I_B$ is a $2^S \times 2^S$ identity matrix. The notation $\rho_i^{\otimes S}$ represents the state of the $S$ qubit pairs, with qubits ordered so that Alice's qubits $A_0, A_1, \ldots, A_{S-1}$ are listed prior to Bob's qubits $B_0, B_1, \ldots, B_{S-1}$.
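The action of the AD channel on a qubit pair follows directly from the Kraus matrices. The sketch below applies the standard amplitude damping Kraus operators to both qubits of $|\Phi^-\rangle\langle\Phi^-|$; the function name `amplitude_damping_pair` is a hypothetical choice for this illustration.

```python
import numpy as np

def amplitude_damping_pair(rho: np.ndarray, gamma: float) -> np.ndarray:
    """Apply independent amplitude damping with parameter gamma to both qubits."""
    E0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]])
    E1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
    out = np.zeros((4, 4), dtype=complex)
    for Ei in (E0, E1):
        for Ej in (E0, E1):
            K = np.kron(Ei, Ej)          # Kraus operator E_ij = E_i (x) E_j
            out += K @ rho @ K.conj().T
    return out

phi_minus = np.array([1.0, 0.0, 0.0, -1.0]) / np.sqrt(2)  # (|00> - |11>)/sqrt(2)
rho_1 = amplitude_damping_pair(np.outer(phi_minus, phi_minus), gamma=0.8)
```

Setting `gamma = 0` returns the input unchanged, and `gamma = 1` maps the Bell state to the product state $|00\rangle\langle 00|$, matching the two limiting cases described in the text.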
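Furthermore, Alice measures her qubits $A_0, A_1, \ldots, A_{S-1}$ using the $2^S$ projection matrices $\Pi^A_a = |a\rangle\langle a| \otimes I$ with $a \in \{0, 1\}^S$, where $|a\rangle$ is the computational basis vector corresponding to the bit string $a$. Denoting as $\rho_i^{AB}(\theta_A)$ the state of the $2S$ qubits after Alice's PQC, the measurement returns the output $a \in \{0, 1\}^S$ with a probability given by the Born rule, i.e.,
$$P_A(a\,|\,\rho_i) = \mathrm{tr}\!\left(\Pi^A_a\,\rho_i^{AB}(\theta_A)\right). \tag{17}$$
Note that the probability (17) is conditioned on the true initial state $\rho_i$ of the qubit pairs. Alice communicates the $S$ classical bits $a \in \{0, 1\}^S$ obtained from the measurement to Bob through a memoryless binary symmetric channel with bit-flip probability $p$. As a result, Bob receives a message $\hat{a} \in \{0, 1\}^S$ with probability $P_{A \to B}(\hat{a}\,|\,a) = p^{d_{a,\hat{a}}}(1 - p)^{S - d_{a,\hat{a}}}$, where $d_{a,\hat{a}}$ is the Hamming distance between the bit strings $a$ and $\hat{a}$. We note that this model can also account for the measurement noise at Alice [31,32]. Therefore, the probability of receiving message $\hat{a}$ at Bob, when the initial state of the qubit pairs is $\rho_i$, is given by
$$P_B(\hat{a}\,|\,\rho_i) = \sum_{a \in \{0,1\}^S} P_{A \to B}(\hat{a}\,|\,a)\,P_A(a\,|\,\rho_i),$$
and the corresponding $2^S \times 2^S$ post-measurement density state of the $S$ qubits at Bob is
$$\rho_i^{B}(\hat{a}) = \frac{1}{P_B(\hat{a}\,|\,\rho_i)} \sum_{a \in \{0,1\}^S} P_{A \to B}(\hat{a}\,|\,a)\,\mathrm{tr}_A\!\left(\Pi^A_a\,\rho_i^{AB}(\theta_A)\,\Pi^A_a\right).$$
Depending on the message $\hat{a} \in \{0, 1\}^S$ received at Bob, Bob performs a local operation given by the unitary $U_B(\theta^B_{\hat{a}})$, leaving the $S$ qubits in his possession in the density state
$$\sigma_i(\hat{a}) = U_B(\theta^B_{\hat{a}})\,\rho_i^{B}(\hat{a})\,U_B(\theta^B_{\hat{a}})^\dagger.$$
Finally, Bob applies a parity projective measurement on the $S$ qubits, using the projection matrices $\Pi^B_0 = \sum_{\text{even } b} |b\rangle\langle b|$ and $\Pi^B_1 = \sum_{\text{odd } b} |b\rangle\langle b|$, where "even" and "odd" refer to the number of 1's in the bit string $b$, with $b \in \{0, 1\}^S$. This produces the output $\hat{i} \in \{0, 1\}$ with probability
$$P(\hat{i}\,|\,\hat{a}, \rho_i) = \mathrm{tr}\!\left(\Pi^B_{\hat{i}}\,\sigma_i(\hat{a})\right).$$
One of the goals of this work is to design the PQC parameters $\theta_A$ and $\theta_B = \{\theta^B_{\hat{a}}\}_{\hat{a} \in \{0,1\}^S}$ at Alice and Bob such that the estimated state index $\hat{i} \in \{0, 1\}$ at Bob equals the true state index $i$ with high probability. We specifically focus on protocols with a single qubit pair, i.e., $S = 1$, as studied in [8], in Section 3.2, and with two qubit pairs, i.e., $S = 2$, in Section 3.3.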
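The effect of the memoryless binary symmetric channel on Alice's message can be tabulated as a transition matrix indexed by bit strings. The following sketch is an illustration with hypothetical helper names (`bsc_matrix`, `received_distribution`); it implements the Hamming-distance expression $p^{d}(1-p)^{S-d}$ and the resulting distribution of the received message.

```python
import numpy as np
from itertools import product

def bsc_matrix(S: int, p: float) -> np.ndarray:
    """Transition matrix T[a_hat, a] = p^d (1-p)^(S-d), d = Hamming distance."""
    strings = list(product((0, 1), repeat=S))  # lexicographic order of bit strings
    T = np.zeros((2**S, 2**S))
    for col, a in enumerate(strings):
        for row, a_hat in enumerate(strings):
            d = sum(x != y for x, y in zip(a, a_hat))
            T[row, col] = p**d * (1 - p)**(S - d)
    return T

def received_distribution(p_a, S: int, p: float) -> np.ndarray:
    """Distribution of the received string given the distribution p_a of the sent one."""
    return bsc_matrix(S, p) @ np.asarray(p_a, dtype=float)

# If Alice always measures 00 and p = 0.1, Bob receives 00, 01, 10, 11
# with the probabilities stored in `received`.
received = received_distribution([1.0, 0.0, 0.0, 0.0], S=2, p=0.1)
```

Each column of the transition matrix sums to one, and for `p = 0.5` every received string is equally likely irrespective of what Alice measured.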

Performance Metrics
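Assuming that the two states $\rho_0$ and $\rho_1$ are selected a priori with equal probability, the average success probability is computed as
$$P_{\mathrm{succ}}(\theta_A, \theta_B) = \frac{1}{2} \sum_{i \in \{0,1\}} \sum_{\hat{a} \in \{0,1\}^S} P_B(\hat{a}\,|\,\rho_i)\,P(\hat{i} = i\,|\,\hat{a}, \rho_i), \tag{22}$$
where $P_B(\hat{a}\,|\,\rho_i)$ is the probability that Bob receives the message $\hat{a}$ and $P(\hat{i} = i\,|\,\hat{a}, \rho_i)$ is the probability that Bob's parity measurement returns the true index $i$, both evaluated when the initial state is $\rho_i$. This probability is a function of the PQC parameters $\theta_A$ and $\theta_B$ at Alice and Bob, respectively. We are interested in the problem of maximizing the average success probability
$$\max_{\theta_A,\,\theta_B}\; P_{\mathrm{succ}}(\theta_A, \theta_B). \tag{23}$$
Problem (23) requires a search over the space of $|\theta_A| + \sum_{\hat{a} \in \{0,1\}^S} |\theta^B_{\hat{a}}|$ PQC parameters, where $|\theta|$ represents the size of the vector $\theta$. This search can be carried out using standard optimization techniques, such as gradient descent.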
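To illustrate how the average success probability is assembled from its parts, the sketch below evaluates it for a single qubit pair ($S = 1$) and for arbitrary fixed $2 \times 2$ unitaries at Alice and Bob. It is a minimal illustration, not the trained PQCs of the paper: the function names are hypothetical, and the Hadamard-based unitaries used in the example are a hand-picked textbook choice for the noiseless case $\gamma = 0$.

```python
import numpy as np

HADAMARD = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
PAULI_X = np.array([[0, 1], [1, 0]])

def ad_pair(rho, gamma):
    """Independent amplitude damping on both qubits of a pair."""
    E0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]])
    E1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
    kraus = [np.kron(Ei, Ej) for Ei in (E0, E1) for Ej in (E0, E1)]
    return sum(K @ rho @ K.conj().T for K in kraus)

def success_probability(U_A, U_B, gamma, p):
    """Average success probability for S = 1.

    U_A: 2x2 unitary at Alice; U_B: dict mapping the received bit to Bob's 2x2 unitary.
    """
    phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
    phi_minus = np.array([1.0, 0.0, 0.0, -1.0]) / np.sqrt(2)
    states = [np.outer(phi_plus, phi_plus),
              ad_pair(np.outer(phi_minus, phi_minus), gamma)]
    total = 0.0
    for i, rho in enumerate(states):
        lifted = np.kron(U_A, np.eye(2))                   # Alice's unitary on qubit A
        rho4 = (lifted @ rho @ lifted.conj().T).reshape(2, 2, 2, 2)
        for a in (0, 1):
            bob_unnorm = rho4[a, :, a, :]                  # trace equals P(a | rho_i)
            for a_hat in (0, 1):
                channel = 1 - p if a_hat == a else p       # binary symmetric channel
                sigma = U_B[a_hat] @ bob_unnorm @ U_B[a_hat].conj().T
                # For S = 1 the parity projectors are |0><0| and |1><1|.
                total += 0.5 * channel * sigma[i, i].real
    return total

bob_unitaries = {0: HADAMARD, 1: PAULI_X @ HADAMARD}
p_noiseless = success_probability(HADAMARD, bob_unitaries, gamma=0.0, p=0.0)
p_noisy = success_probability(HADAMARD, bob_unitaries, gamma=0.0, p=0.25)
```

With these hand-picked unitaries and $\gamma = 0$, discrimination is perfect over a noiseless link and every bit flip on the link causes a wrong decision, so the success probability equals $1 - p$.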
We now discuss two upper bounds on the average success probability (22), namely the Helstrom bound and the PPT bound.

Helstrom Bound
Assume that all $S$ qubit pairs were available at a central node that could perform global measurements on all qubits. The maximum probability of successful detection in this system provides an upper bound on the probability of success for the distributed system under study. Allowing for a general positive operator valued measure (POVM), this approach yields the Helstrom bound [33,34]
$$P_{\mathrm{succ}} \le \frac{1}{2} + \frac{1}{4}\left\|\rho_0^{\otimes S} - \rho_1^{\otimes S}\right\|_1, \tag{24}$$
where $\|H\|_1$ represents the $\ell_1$-norm of the Hermitian matrix $H$, which is defined as the sum of the absolute values of the eigenvalues of the matrix $H$.
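The Helstrom bound for two equiprobable states can be evaluated numerically from the eigenvalues of the difference of the two density matrices. The short sketch below assumes equal priors, as in the text; the function name `helstrom_bound` is hypothetical.

```python
import numpy as np

def helstrom_bound(rho0: np.ndarray, rho1: np.ndarray) -> float:
    """Return 1/2 + 1/4 * ||rho0 - rho1||_1 for equiprobable states."""
    eigenvalues = np.linalg.eigvalsh(rho0 - rho1)  # the difference is Hermitian
    return 0.5 + 0.25 * float(np.sum(np.abs(eigenvalues)))

phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
phi_minus = np.array([1.0, 0.0, 0.0, -1.0]) / np.sqrt(2)
bound_orthogonal = helstrom_bound(np.outer(phi_plus, phi_plus),
                                  np.outer(phi_minus, phi_minus))
```

Orthogonal states, such as the two noiseless Bell states in the example, give a bound of one, whereas identical states give 0.5, i.e., random guessing.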

Positive Partial Transpose (PPT) Bound
A tighter bound is obtained by restricting the type of measurements that are allowed at the central node having access to all the $S$ qubit pairs. In particular, such a restriction can be defined so as to include LOCC operations as a special case [35]. The resulting PPT bound is obtained as the maximum value of the objective function of the semidefinite program (SDP) [36]
$$\begin{aligned} \max_{M_0,\,M_1}\quad & \frac{1}{2}\,\mathrm{tr}\!\left(M_0\,\rho_0^{\otimes S}\right) + \frac{1}{2}\,\mathrm{tr}\!\left(M_1\,\rho_1^{\otimes S}\right) \\ \text{s.t.}\quad & M_0 \succeq 0,\; M_1 \succeq 0,\; M_0 + M_1 = I, \\ & M_0^{T_B} \succeq 0,\; M_1^{T_B} \succeq 0, \end{aligned}$$
where $M_i^{T_B}$ represents the partial transpose of the operator $M_i$ [8,37] with respect to the Hilbert space of Bob's qubits. We emphasize that the Helstrom and PPT bounds do not depend on the communication between Alice and Bob, as they assume a centralized implementation.
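The partial transpose that appears in the PPT constraints is simple to compute with array reshaping. The sketch below is an illustration with a hypothetical function name; it assumes, as in the text, that Alice's qubits are listed before Bob's, so that an operator on the joint space factorizes as a `dim_a` by `dim_b` system. Solving the SDP itself would additionally require a convex optimization package, which is not shown here.

```python
import numpy as np

def partial_transpose_bob(M: np.ndarray, dim_a: int, dim_b: int) -> np.ndarray:
    """Transpose only Bob's indices of an operator on a (dim_a * dim_b)-dim space."""
    M4 = M.reshape(dim_a, dim_b, dim_a, dim_b)   # indices [a, b, a', b']
    # Swap Bob's row and column indices: new[a, b, a', b'] = old[a, b', a', b].
    return M4.transpose(0, 3, 2, 1).reshape(dim_a * dim_b, dim_a * dim_b)

phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
bell_projector = np.outer(phi_plus, phi_plus)
min_eigenvalue = np.min(np.linalg.eigvalsh(partial_transpose_bob(bell_projector, 2, 2)))
```

The projector onto a Bell state has a negative eigenvalue after partial transposition, so it is not an admissible PPT measurement operator; this is the sense in which the PPT constraints restrict the centralized measurement.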
In Section 3.4, we will also evaluate the performance of the system illustrated in Figure 8 when the PQC parameters $\theta_A$ and $\theta_B$ are optimized by addressing the problem (23) with the correct value of the channel bit flip probability $p$.
Figure 8. Illustration of the LOCCNet protocol [8] for distributed quantum state discrimination, which operates on a single pair of qubits ($S = 1$).

Noise Aware-LOCCNet
In this section, we introduce the NA-LOCCNet protocol, which operates on S = 2 qubit pairs. There are two main innovations as compared to the LOCCNet protocol: (i) We introduce an ansatz for the PQCs at Alice and Bob based on two-qubit rotation gates that can outperform the separate application of the LOCCNet protocol in Figure 8 to the two qubit pairs; (ii) we propose the direct optimization of the noise-aware performance objective (23), which is capable of adapting to the current classical noise level p, as well as to the quantum noise level γ.
For the PQCs, we adopt the architecture shown in Figure 9, where the two-qubit Pauli ZY-rotation gate is defined in (12). Note that the Pauli ZY-rotation gates are followed at Alice, and preceded at Bob, by a controlled NOT (CNOT) gate. This ansatz has been selected through a partial numerical search. We specifically explored other ansatzes with different two-qubit and single-qubit rotation gates, changing the position of the CNOT gate before and after the rotation gates, and changing the control and target qubits of the CNOT gates. The proposed ansatz in Figure 9 returned the best performance among the ansatzes that we considered.
For every value of the noise level $\gamma$ and bit flip probability $p$, we propose to optimize the average success probability in (22) over the rotation angles $\theta_A$ and $\theta^B_{\hat{a}}$, where $\hat{a} \in \{0, 1\}^2$.
Figure 9. The proposed NA-LOCCNet protocol for distributed quantum state discrimination that operates over $S = 2$ qubit pairs and adapts to the classical and quantum noise levels $p$ and $\gamma$.

Experiments
In this section, we evaluate the performance of the proposed NA-LOCCNet protocols in the presence of a noisy CC link from Alice to Bob. We assume the availability of $S$ qubit pairs, and we consider LOCCNet, reviewed in Section 3.2, as the benchmark protocol. As discussed in Section 3.3, LOCCNet applies separately to the two qubit pairs, while the proposed NA-LOCCNet operates jointly on the two qubit pairs. LOCCNet is designed, for $S = 1$, as in [8], by setting $p = 0$ in the optimization problem (23), and we also evaluate the performance of the LOCCNet architecture in Figure 8 when the optimization is conducted by accounting for the actual value of $p$. We label this scheme as NA-LOCCNet ($S = 1$), since the design is noise aware. Optimization is conducted using the Adam gradient descent optimizer [28], with a 0.01 learning rate and 1000 iterations. As performance bounds, we show the PPT bounds described in Section 3.1.2, which are tighter than the Helstrom bounds, for both the cases $S = 1$ and $S = 2$.
Figure 10 plots the average success probability (22) as a function of the bit flip probability $p$ of the noisy CC link by fixing the noise parameter of the AD channel to $\gamma = 0.8$, while Figure 11 plots the same quantity as a function of the noise parameter $\gamma$ of the AD channel by fixing the bit flip probability of the noisy CC link to $p = 0.25$. In both figures, we use red lines for single-pair protocols, i.e., $S = 1$, and blue lines for two-pair protocols, i.e., $S = 2$.
Figure 10 shows that, as the bit flip probability $p$ of the noisy CC link increases, the proposed NA-LOCCNet protocol vastly outperforms LOCCNet and NA-LOCCNet ($S = 1$). Specifically, the performance of LOCCNet degrades linearly as $p$ increases, whereas the proposed NA-LOCCNet is significantly more robust to the communication noise. Note that, as suggested by comparing the PPT bounds with $S = 1$ and $S = 2$, the performance gain for $p = 0.5$, i.e., for a completely noisy CC link, stems from the joint processing of two qubit pairs.
The advantages of NA-LOCCNet are further validated by Figure 11, which demonstrates the gains of NA-LOCCNet at all values of the noise parameter $\gamma$ of the AD channel. Interestingly, the probability of success first decreases and then increases as a function of the noise strength $\gamma$. To explain this behavior, consider the case $p = 0.5$ of a fully noisy CC link and assume that Alice does not perform any operation on her qubits. In this case, Bob needs to distinguish $\rho_0^{\otimes 2}$ and $\rho_1^{\otimes 2}$ based solely on the local states $\mathrm{tr}_A(\rho_0^{\otimes 2})$ and $\mathrm{tr}_A(\rho_1^{\otimes 2})$, where $\mathrm{tr}_A(\cdot)$ represents the partial trace operation with respect to the qubits at Alice. The maximal probability of success for detection at Bob is given by the Helstrom bound (24) as
$$P_{\mathrm{succ}} = \frac{1}{2} + \frac{1}{4}\left\|\mathrm{tr}_A(\rho_0^{\otimes 2}) - \mathrm{tr}_A(\rho_1^{\otimes 2})\right\|_1. \tag{27}$$
The probability of success (27) takes the minimal value 0.5 when there is no AD quantum noise, i.e., when $\gamma = 0$, since in this case we have $\mathrm{tr}_A(\rho_0^{\otimes 2}) = \mathrm{tr}_A(\rho_1^{\otimes 2}) = 0.25\,I_4$. In contrast, at the other extreme, when $\gamma = 1$, we have $\mathrm{tr}_A(\rho_0^{\otimes 2}) = 0.25\,I_4$ and $\mathrm{tr}_A(\rho_1^{\otimes 2}) = |00\rangle\langle 00|$, and hence the probability of success (27) is given by $P_{\mathrm{succ}} = 0.875 > 0.5$. This argument suggests that, when the CC noise level $p$ is sufficiently large, the presence of an entanglement-breaking channel can be instrumental in improving the detection performance achievable via LOCC.
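The argument above can be explored numerically by evaluating the Helstrom expression on Bob's local states for any noise parameter $\gamma$ and number of pairs $S$. The sketch below is a minimal illustration under the stated model (equal priors, independent amplitude damping on both qubits of $|\Phi^-\rangle$, and no operation at Alice); the function names are hypothetical.

```python
import numpy as np

def ad_pair(rho, gamma):
    """Independent amplitude damping on both qubits of a pair."""
    E0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]])
    E1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
    kraus = [np.kron(Ei, Ej) for Ei in (E0, E1) for Ej in (E0, E1)]
    return sum(K @ rho @ K.conj().T for K in kraus)

def trace_out_alice(rho):
    """Partial trace over the first qubit of a two-qubit state ordered (A, B)."""
    return np.einsum('abac->bc', rho.reshape(2, 2, 2, 2))

def bob_local_helstrom(gamma: float, S: int) -> float:
    """Helstrom success probability when Bob only sees his S local qubits."""
    phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
    phi_minus = np.array([1.0, 0.0, 0.0, -1.0]) / np.sqrt(2)
    local0 = trace_out_alice(np.outer(phi_plus, phi_plus))
    local1 = trace_out_alice(ad_pair(np.outer(phi_minus, phi_minus), gamma))
    joint0 = joint1 = np.eye(1)
    for _ in range(S):                      # S independent copies at Bob
        joint0 = np.kron(joint0, local0)
        joint1 = np.kron(joint1, local1)
    eigenvalues = np.linalg.eigvalsh(joint0 - joint1)
    return 0.5 + 0.25 * float(np.sum(np.abs(eigenvalues)))

# Sweep the AD noise level for two pairs.
gammas = np.linspace(0.0, 1.0, 5)
curve = [bob_local_helstrom(g, S=2) for g in gammas]
```

The sweep shows how the success probability achievable from Bob's local states alone grows from random guessing at $\gamma = 0$ as the AD noise increases, which is the mechanism invoked in the text.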

Conclusions
In this paper, we have studied the problems of the distributed entanglement distillation and distributed quantum state discrimination in the presence of noisy classical communications. Specifically, we have proposed to train PQCs at the two parties so as to maximize the average fidelity in the entanglement distillation and average success probability in the quantum state discrimination. Simulation results have confirmed the advantages of the proposed NA-LOCCNet over the existing protocols designed for noiseless classical communications. Future work in entanglement distillation may involve the integration of the proposed scheme into a network protocol for entanglement distillation [38]. For quantum state discrimination, it was observed that quantum entanglement-breaking noise on the observed system can be advantageous to improve the detection capacity when classical communication is noisy. Further increasing the number of qubit pairs (S > 2) may result in better protocols, and is a direction for future research.